Preface


This is the final report on work performed by L.N.K. Corporation
under contract DAAK 70-79-C-0091. Dr. John W. Bond, Jr.,
DRDME-RT, U.S. Army Mobility Equipment Research and Development
Command served as contract technical monitor. The report was
prepared by L.N.K. scientists, Mr. David Lavine, Ms. Barbara Lambird
and Dr. Laveen N. Kanal. Pattern description and reasons for
performing this study are classified. The classified explanation can
be obtained from Dr. Bond, telephone number (703) 664-4547, Autovon
354-5375 or 4547.
Summary


This is the final report on work performed by L.N.K. Corporation
under contract DAAK 70-79-C-0091 from the U.S. Army Mobility
Equipment Research and Development Command.

Two basic types of planar point patterns were investigated.
The first type is specified by a set of points in a plane and a
noise model which determines the allowed perturbations of the points
subject only to the restriction that no interpoint distance should
change by more than a certain percentage. The second type is
specified by a set of points subject to angle and distance
constraints.

Representations of point patterns having some invariance
properties are presented. For the first type of prototype patterns,
these include SIDVs (Sorted Interpoint Distance Vectors), SNNs
(Sorted Nearest Neighbor Vectors), and MSTs (Minimal Spanning
Trees). Theorems, experimental results of simulations, advantages,
disadvantages and comparisons are presented for classification
procedures based on these representations. It is concluded that
classification of the first type of patterns can be accomplished
using the methods based on SIDVs, SNNs and MSTs under certain
restrictions. If the number of points in the pattern is small,
less than 100, then the SIDV method would be better, while for more
than 100 points the SNN or methods based on MSTs would be more
useful. If the percentage of additions or deletions of points is
small, of the order of ten percent, then the SIDV, SNN, and MST are
still feasible. If the percentage is large, then other methods
must be used.
The techniques used to classify prototype point patterns give
a measure of the quality of a match which can be used to provide
a measure of the confidence in the classification of a particular
pattern.

For the second type of point prototype, specified by a set
of constraints, only one example was provided. An algorithm that
could recognize any instance of this pattern is presented.

Suggestions for further analytical and experimental investigation
of classification schemes when the number of additions and
deletions of pattern points becomes large are given in the final
section of the report. Testing of all the devised classification
schemes on actual noisy data from point patterns is also
recommended.
1. Introduction


This final report discusses the research performed by L.N.K.
Corporation for MERADCOM's pattern investigation. The research
included identifying important types of patterns, investigating
invariants of the patterns, developing algorithms for finding these
invariants, and finally developing and testing procedures for
recognizing point patterns.

1.1 Basic Problem

It was assumed that extraction techniques for the point
patterns had been successfully developed so that the patterns
consisted of bright blobs embedded in a background of small to moderate
noise. The noise appears in the patterns as relatively few blobs
not as bright as the pattern blobs. The objective is: given a
window with bright spots embedded in a background, recognize and
classify the pattern. A formal statement of the problem is
presented in Appendix I.

1.2 Normal Observer and Planar Terrain

At first, it is assumed that the points of the pattern occur
in a flat or planar terrain and that the observation platform is
normal to this plane. Patterns extracted under such conditions
are referred to as ideal patterns. These patterns could have added
points due to noise but no deleted points.


1.3 Non-Normal Observer and Planar Terrain

The terrain is still considered to be flat, but now the
observation platform can be far from normal. All the points in the
pattern are still visible, i.e. there is no fusion or coalescence
of points. This case was not considered any different from the
above case since there are standard techniques for correcting
perspective distortions given the polar angle. For example, see Sec 6.3
of Digital Picture Processing by Azriel Rosenfeld and Avinash C.
Kak (Academic Press, 1976).

1.4 Non-Planar Terrain

If the terrain is not flat, then points can be masked from
the observation platform due to the terrain, or points may coalesce.
Using a terrain database, rectification of the image to the normal
viewing angle is possible, but now the possibility of deletions of
pattern points must be taken into account.

1.5 Translation, Rotation, and Scaling of Point Patterns

Since the observed patterns can have arbitrary orientation
and location in the plane, the classification schemes should work
regardless of any translations, rotations, or scaling. All reported
classification schemes are invariant under rotation and translation.
It is assumed that the altitude of the observation platform is
known, so that scaling to a standard altitude is done on the point
patterns before applying the recognition techniques.
2.


Two  basic  types  of  point  patterns  were  investigated.  The 
first  type  of  pattern  is  characterized  by  a  prototype  pattern  which 
is  specified  by  a  set  of  points  embedded  in  a  plane  and  a  noise 
model  which  determines  the  allowed  perturbations.  The  second  type 
of  pattern  is  specified  by  a  set  of  angle  and  distance  constraints. 

2.1  Prototype  Patterns 

Because  actual  samples  of  point  patterns  were  generally 
unavailable,  the  emphasis  of  the  investigation  was  on  prototype 
patterns.  These  patterns  are  specified  as  a  set  of  "ideal"  patterns 
where  each  pattern  is  defined  by  a  set  of  points  embedded  in  a 
plane.  The  noise  model  then  determines  to  what  extent  the  points 
can  move  around  in  the  plane.  In  this  type  of  pattern,  the  number 
of  points  is  considered  fixed.  Results  of  the  investigation  on 
the  prototype  patterns  are  given  in  Sections  3  to  6. 


2.2 Pattern Specified by a Set of Constraints

This type of pattern is specified by a set of constraints.
The constraints define regions specified by a set of angles, define
the number of points possible in a region and can also limit the
allowed interpoint distances. Thus, the number of points in such
a pattern is not necessarily fixed.

One example of this type of pattern was made available to us
and the results of investigating this pattern are discussed in
Section 7. To investigate these types of patterns more generally,
many more patterns need to be provided in order to be able to
categorize the types of constraints that occur in such point patterns.


3. Invariants of Prototype Patterns


The basic approach used for matching each prototype pattern
to its noisy versions is to define a way of representing both the
prototype and noisy versions. These representations are then
compared to determine whether or not they correspond to the same
pattern. Some of the representations involve transformations that
lose information so that it is not possible to reconstruct the
original point pattern from the representation. Generally, the
greater the loss of information in a representation, the simpler
the representation and the easier it is to compare representations.

The most fundamental representation of a point pattern, which
is invariant under rotation and translation, is the interpoint
distance matrix. For a point set $A = \{a_1,\dots,a_N\}$, $a_i \in R^2$, the
interpoint distance matrix is defined to be the NxN matrix whose
ij-th entry is the distance from point $a_i$ to point $a_j$.

Ex. 3.1 Let $P = \{a_1, a_2, a_3\}$ be a point pattern where

$a_1 = (0,0)$, $a_2 = (0,4)$, $a_3 = (3,0)$.

Then

\[ IDM(P) = \begin{pmatrix} 0 & 4 & 3 \\ 4 & 0 & 5 \\ 3 & 5 & 0 \end{pmatrix} \]

This representation is fundamental because it is possible to
reconstruct the original point set from the matrix except for a
translation, rotation and reflection. Since the interpoint distance
matrix, IDM, and the original point set are equivalent in the above
sense, it is natural to compare patterns by comparing their
interpoint distance matrices. Unfortunately, many IDMs can be derived
from one point set, depending on the order in which the points are
processed.


Ex. 3.2 Let S = {(0,0), (3,0), (0,4)} be a three point pattern.

There are six possible labellings of these points by
labels $a_1$, $a_2$ and $a_3$. We show these six labellings and
the corresponding interpoint distance matrix whose ij-th
entry is the distance from point i to point j.

Case  1  Labelling  Interpoint  Distance  Matrix 
The uncertainty in an IDM is determined up to a permutation
of the entries in the matrix. Let

\[ A = \begin{pmatrix} a_{11} & \cdots & a_{1N} \\ \vdots & & \vdots \\ a_{N1} & \cdots & a_{NN} \end{pmatrix} \]

be an interpoint distance matrix for pattern P. Let $S_N$ be the
set of all permutations of the integers {1,...,N}. Then the set
of all interpoint distance matrices for the pattern P is given by

\[ \left\{ \begin{pmatrix} a_{\pi(1)\pi(1)} & \cdots & a_{\pi(1)\pi(N)} \\ \vdots & & \vdots \\ a_{\pi(N)\pi(1)} & \cdots & a_{\pi(N)\pi(N)} \end{pmatrix} : \pi \in S_N \right\} \]
Ex. 3.3 Let $S_3 = \{\pi_1,\dots,\pi_6\}$ be the set of the six
permutations of {1, 2, 3}.

Let $a_1 = (0,4)$, $a_2 = (3,0)$ and $a_3 = (0,0)$ (case 3).
As in example 3.2, the IDM is

\[ \begin{pmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{pmatrix} = \begin{pmatrix} 0 & 5 & 4 \\ 5 & 0 & 3 \\ 4 & 3 & 0 \end{pmatrix} \]

The permuted IDM corresponding to $\pi_2$ is

\[ \begin{pmatrix} a_{\pi_2(1)\pi_2(1)} & a_{\pi_2(1)\pi_2(2)} & a_{\pi_2(1)\pi_2(3)} \\ a_{\pi_2(2)\pi_2(1)} & a_{\pi_2(2)\pi_2(2)} & a_{\pi_2(2)\pi_2(3)} \\ a_{\pi_2(3)\pi_2(1)} & a_{\pi_2(3)\pi_2(2)} & a_{\pi_2(3)\pi_2(3)} \end{pmatrix} \]
Note that this is just case 6 in example 3.2.
3.1 Comparing Two Interpoint Distance Matrices

Since the set $S_n$ contains n! elements, the problem of
comparing two IDMs can be formidable.

In the simplest possible comparison, we are given two
interpoint distance matrices and ask if they correspond to the same
unperturbed point set, which involves comparing n! permutations.

A more realistic and complex version of the problem involves
comparing two IDMs determined from two different point patterns.
The problem is to decide if one of the point patterns is a
perturbation of the other. The comparison is done by finding the
permutation of one matrix that gives the best component by component
fit with the entries of the other matrix. This search might be
performed by doing a least squares fit of one matrix with each
permutation of the other matrix and selecting the permutation
yielding the smallest sum of squared differences. Since this
procedure is computationally expensive, other algorithms were sought
to provide approximate solutions to the problem.

Ex. 3.4 Consider the pattern R = {(0,0), (3.1,0), (0,4.2)}.

The interpoint distance matrix is

\[ A = \begin{pmatrix} 0 & 3.1 & 5.2 \\ 3.1 & 0 & 4.2 \\ 5.2 & 4.2 & 0 \end{pmatrix} \]

Let A and B be 3x3 matrices whose ij components are $a_{ij}$
and $b_{ij}$ respectively. Define d(A,B), the distance
between A and B, by

\[ d(A,B) = \sqrt{\sum_{j=1}^{3} \sum_{i=1}^{3} (a_{ij} - b_{ij})^2} \]

To determine the closeness of pattern R and pattern S
of Ex. 3.2, we compute $d(A,S_i)$ for i = 1,...,6, where
the $S_i$ are the IDMs of S. We see that A is closest
to case 1 and the distance is
$\sqrt{2(0.1^2 + 0.2^2 + 0.2^2)} \approx 0.42$; the factor of 2
arises because each off-diagonal difference appears twice in the
symmetric matrices.
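The exhaustive comparison described in this section can be sketched as
follows. This is a minimal illustration under stated assumptions, not
the procedure used in the study: the helper names (`idm`, `permute`,
`matrix_distance`, `best_permutation_fit`) are introduced here, and
the search simply tries all n! relabellings, which is only practical
for very small n.

```python
import itertools
import math

def idm(points):
    """Interpoint distance matrix of a list of (x, y) points."""
    return [[math.dist(p, q) for q in points] for p in points]

def permute(mat, perm):
    """Apply the same permutation to the rows and columns of mat."""
    return [[mat[i][j] for j in perm] for i in perm]

def matrix_distance(a, b):
    """Square root of the sum of squared entrywise differences."""
    return math.sqrt(sum((x - y) ** 2
                         for row_a, row_b in zip(a, b)
                         for x, y in zip(row_a, row_b)))

def best_permutation_fit(a, b):
    """Try every relabelling of b; return (smallest distance, permutation)."""
    n = len(a)
    return min((matrix_distance(a, permute(b, perm)), perm)
               for perm in itertools.permutations(range(n)))

R = [(0, 0), (3.1, 0), (0, 4.2)]
S = [(0, 0), (3, 0), (0, 4)]
dist, perm = best_permutation_fit(idm(R), idm(S))
print(dist, perm)
```

Because the sketch uses unrounded distances (the longest side of R is
about 5.22 rather than 5.2), the value it prints differs slightly from
hand arithmetic on the one-decimal matrix entries.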
3.2 Canonical Forms of the Interpoint Distance Matrix

One approach to finding approximate pattern matches is to
compute new representations from the IDMs and compare the patterns
by using the new representations. These representations are
invariant under permutations in $S_n$. This means two interpoint
distance matrices, which differ only by a permutation belonging to
$S_n$, will produce the same representation.

One such class of representations is a polynomial on the
entries of the matrix which does not change in value if the matrix
is permuted. A simple example of such a polynomial is the product
of all the non-diagonal entries in the matrix. The construction
of polynomials defined on $R^n$ and invariant under certain types of
permutations has been extensively studied [1].

Ex. 3.5 Let S = {(0,0), (3,0), (0,4)} be the pattern in example
3.2. In each of the six IDMs for this pattern the
product of the non-diagonal elements is
$3 \cdot 3 \cdot 4 \cdot 4 \cdot 5 \cdot 5 = 3{,}600$.
Note that this number is independent of the
choice of IDM.
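The invariance claimed in Ex. 3.5 is easy to check numerically. The
sketch below is illustrative only; `offdiag_product` and the other
helper names are assumptions introduced here.

```python
import itertools
import math

def idm(points):
    """Interpoint distance matrix of a list of (x, y) points."""
    return [[math.dist(p, q) for q in points] for p in points]

def offdiag_product(mat):
    """Product of all non-diagonal entries of a square matrix."""
    n = len(mat)
    return math.prod(mat[i][j] for i in range(n) for j in range(n) if i != j)

S = [(0, 0), (3, 0), (0, 4)]
# One product per labelling (ordering) of the three points.
products = [offdiag_product(idm(list(order)))
            for order in itertools.permutations(S)]
print(products)
```

All six labellings give the same product, as the example states.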

Several  problems  arise  with  the  use  of  invariant  polynomials. 
First,  such  invariant  polynomials  are  non-trivial  to  find.  Second, 
the  invariant  polynomials  involve  multiplication  of  interpoint 
distances,  so  it  is  impossible  to  determine  how  much  the  individual 
components  contributed. 

To avoid these difficulties we sought a different type of
invariant. To each IDM we associate the vector whose entries are
the elements $\{a_{ij}\}$, $j > i$, of the matrix but sorted in increasing
order. We denote this sorted interpoint distance vector by SIDV.
A SIDV is invariant under permutations of the IDM from which it is
derived. In the following example the two inequivalent IDMs
give rise to the same SIDV. We have been unable to determine
if this is true for point patterns with more than five points.


Ex.  3-6 


The usefulness of the SIDV is based on the following
observations:

1) In practice we have not encountered distinct patterns with
nearly identical SIDVs.

2) The SIDV is completely determined by the set of points and
does not depend on the order in which they are processed.

3) Noisy versions of a pattern have SIDVs similar to that of
the original pattern.

4) SIDVs can be used to define a similarity measure between
patterns.

Before elaborating on the above points, we mention a problem
with this approach. The comparison of two SIDVs becomes very
expensive if the cardinalities of the corresponding point sets differ.


4. The Sorted Interpoint Distance Vector

The SIDV was useful for several classification algorithms.
These procedures provide a measure of the quality of a match
between a noisy pattern and each ideal pattern. Initially,
we assume that a noisy pattern is obtained by perturbing
points in the ideal pattern, so points are neither added nor
deleted. Later, this restriction is relaxed slightly, but a
large number of additions or deletions will require other
types of algorithms.


4.1 Theory

Def. Let $P = \{a_1,\dots,a_n\}$ be a pattern. The interpoint
distance matrix, IDM(P), of P is defined to be the
n x n matrix whose ij-th entry is the Euclidean
distance from $a_i$ to $a_j$.

Def. The sorted interpoint distance vector, SIDV, of the
pattern $P = \{a_1,\dots,a_n\}$ is the vector of size $(n^2 - n)/2$
whose entries are the elements $\{IDM(P)_{ij}\}$, $j > i$, in
increasing order, where $IDM(P)_{ij}$ is the ij-th entry of IDM(P).


Ex. 4.1 Let S = {(0,0), (3,0), (0,4)} be the pattern
of example 3.2. Its interpoint distance matrix
is:

\[ \begin{pmatrix} 0 & 3 & 4 \\ 3 & 0 & 5 \\ 4 & 5 & 0 \end{pmatrix} \]

The entries above the diagonal are 3, 4, and 5.
Thus these elements, sorted in increasing order,
give:

SIDV(S) = (3, 4, 5).
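A minimal Python sketch of the SIDV definition follows. It is an
illustration rather than the study's implementation; the function name
`sidv` is an assumption made here.

```python
import itertools
import math

def sidv(points):
    """Sorted interpoint distance vector: the (n^2 - n)/2 pairwise
    Euclidean distances, in increasing order."""
    return sorted(math.dist(p, q)
                  for p, q in itertools.combinations(points, 2))

S = [(0, 0), (3, 0), (0, 4)]
print(sidv(S))
```

Because the distances are sorted, the result does not depend on the
order in which the points are listed.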

Our first point pattern classification procedure, SIDV,
compares a pattern P with an ideal pattern $P_1$ by comparing their
vectors SIDV(P) and SIDV($P_1$) componentwise. P and $P_1$ are said
to be an α-percent match if

\[ \frac{|SIDV(P_1)_j - SIDV(P)_j|}{SIDV(P_1)_j} \le \alpha \]

for all j, where the subscript j denotes the component of
the vector. We now show that SIDV is an α-percent regular
point pattern classification procedure. (See Appendix I for
the definition of an α-percent regular point pattern
classification procedure.) This is proved in Theorem I.

This theorem guarantees that if we take a sorted list of numbers,
perturb each by no more than α-percent and sort this list, then it
must match the original list component by component to within
α-percent. Thus if the first list is the SIDV for a pattern and
the second is the SIDV for an α-percent perturbation, then the
lists must match component by component to within α-percent.
Thm I Let $A = \{a_1,\dots,a_m\}$ and $B = \{b_1,\dots,b_m\}$ where $a_i \in R$,
$b_i \in R$ for $i = 1,\dots,m$ and $a_1 \le a_2 \le \dots \le a_m$ and
$b_1 \le b_2 \le \dots \le b_m$. Let $M = \{1,\dots,m\}$ and assume there
exists a 1-1 and onto map $f: M \to M$ such that

\[ |a_i - b_{f(i)}| \le \alpha\, a_i \quad \text{for } i = 1,\dots,m. \]

Then

\[ |a_i - b_i| \le \alpha\, a_i \quad \text{for } i = 1,\dots,m. \]

Proof: If $f(i) = i$ for all $1 \le i \le m$ then we are done. Thus
we are reduced to showing that the result holds for any
$i_0$ such that $f^{-1}(i_0) < i_0$ or $f^{-1}(i_0) > i_0$.

Case 1) $c = f^{-1}(i_0) < i_0$.
Hence $a_c \le a_{i_0}$.
Thus $b_{i_0} \le (1+\alpha)a_c \le (1+\alpha)a_{i_0}$.

Subcase 1) $g = f(i_0) < i_0$.

$b_{i_0} \ge b_g \ge (1-\alpha)a_{i_0}$.

Subcase 2) $g = f(i_0) > i_0$.

We must again show $b_{i_0} \ge (1-\alpha)a_{i_0}$.

Claim: there exists a $j_0 > i_0$ such that $f(j_0) < i_0$.

From this we see
$b_{i_0} \ge b_{f(j_0)} \ge (1-\alpha)a_{j_0} \ge (1-\alpha)a_{i_0}$.

The claim follows from the fact that if
$f(k) > i_0$ for all $k \in \{i_0+1,\dots,m\}$ then, since
$f(i_0) > i_0$, we must have $f(k_1) = f(k_2)$ for
some distinct $k_1, k_2 \in \{i_0,\dots,m\}$. This
contradicts the assumption that f is 1-1. (The value
$f(j_0) = i_0$ is excluded because $f^{-1}(i_0) = c < i_0$.)

Case 2) $f^{-1}(i_0) > i_0$.

The proof is analogous to that of case 1.

This fact allows a measure of the quality of the fit to be
made. If any pair of corresponding components fails to match to
within α-percent then P cannot be an α-percent noisy pattern of $P_1$.
Any of the standard measures of distance between vectors, such as
Euclidean distance, can be computed, resulting in a measure of the
goodness of fit.
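The componentwise test and the goodness-of-fit measure can be sketched
as below. This is a hedged illustration, not the study's code: the
names `alpha_match` and `goodness_of_fit` are assumptions, α is
expressed as a fraction (0.30 for thirty percent), and the numbers are
made up to show the re-sorting behaviour that Theorem I addresses.

```python
import math

def alpha_match(ideal, observed, alpha):
    """True if each component of `observed` lies within the fraction
    `alpha` of the corresponding component of `ideal`."""
    if len(ideal) != len(observed):
        return False
    return all(abs(a - b) <= alpha * a for a, b in zip(ideal, observed))

def goodness_of_fit(ideal, observed):
    """Euclidean distance between two equal-length sorted vectors."""
    return math.dist(ideal, observed)

ideal = [3.0, 4.0, 5.0]
# Each entry moved by at most 25 percent; the order of the first two flips.
perturbed = [3.75, 3.0, 5.0]
observed = sorted(perturbed)

print(alpha_match(ideal, observed, 0.30))
print(alpha_match(ideal, observed, 0.05))
print(goodness_of_fit(ideal, observed))
```

Although the perturbation changes which entry is smallest, the
re-sorted vector still matches the ideal vector within the same
tolerance, as the theorem predicts.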
4.2 Example Using SIDV Algorithm

In this section, we demonstrate how SIDV is used to classify
patterns. Fig 4-1(a) shows an ideal pattern (solid circles)
called A and an α-percent noisy version of that pattern (open circles)
called N. Fig 4-1(b) shows a different ideal pattern, called B.

The interpoint distances of each of the three patterns are
calculated and sorted. A component by component difference is
then computed. Fig 4-2(a) shows ideal pattern A compared with
noisy pattern N. The third column is the absolute difference
between their sorted interpoint distance vectors. Fig 4-2(b)
shows ideal pattern B similarly compared with noisy pattern N.

Suppose a threshold for the difference is selected and the
number of differences greater than this threshold is counted. If
a threshold of 2.0 is picked, then no differences greater than 2.0
occur in the comparison of patterns A and N. But five differences
are greater than 2.0 in the comparison of patterns B and N.

Using a threshold of 2.0, pattern N would be classified as an
α-percent noisy version of pattern A and not pattern B. Fig 4-3
shows a plot of the total number of differences greater than the
threshold of 2.0.


Fig 4-1(a). The solid circles form ideal pattern A. The open circles form
pattern N, an α-percent noisy version of pattern A.

Fig 4-1(b). The solid circles form ideal pattern B.
Sorted Interpoint Distances

Ideal A    Noisy N    Difference    Diff > Threshold (2)
  6.2        6.0         0.2
  6.2        6.3         0.1
  8.0        7.5         0.5
  8.0        8.2         0.2
  8.0        8.6         0.6
  8.0        9.9         1.9
 12.1       11.9         0.2
 12.1       12.2         0.1
 13.1       13.1         0.0
 13.1       13.9         0.8

Fig 4-2(a). Comparing SIDV's of patterns A and N.


Ideal B    Noisy N    Difference    Diff > Threshold (2)
  6.2        6.0         0.2
  6.2        6.3         0.1
  8.0        7.5         0.5
  8.0        8.2         0.2
  9.0        8.6         0.4
 14.2        9.9         4.3            yes
 14.2       11.9         2.3            yes
 15.8       12.2         3.6            yes
 15.8       13.1         2.7            yes
 20.6       13.9         6.7            yes

Fig 4-2(b). Comparing SIDV's of patterns B and N.
Fig 4-3. Histogram of Result of Comparing Noisy Pattern N with Ideal Patterns
A and B. (Plot title: Example of Using SIDV. Axes: number of comparisons
against number of differences greater than threshold = 2.0.)
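The threshold count used in this example can be reproduced from the
values tabulated in Fig 4-2. The sketch below is illustrative; the
function name `count_exceeding` and the variable names are assumptions
introduced here.

```python
# Sorted interpoint distances as tabulated in Fig 4-2(a) and 4-2(b).
IDEAL_A = [6.2, 6.2, 8.0, 8.0, 8.0, 8.0, 12.1, 12.1, 13.1, 13.1]
IDEAL_B = [6.2, 6.2, 8.0, 8.0, 9.0, 14.2, 14.2, 15.8, 15.8, 20.6]
NOISY_N = [6.0, 6.3, 7.5, 8.2, 8.6, 9.9, 11.9, 12.2, 13.1, 13.9]

def count_exceeding(ideal, noisy, threshold):
    """Number of components whose absolute difference exceeds threshold."""
    return sum(abs(a - b) > threshold for a, b in zip(ideal, noisy))

print(count_exceeding(IDEAL_A, NOISY_N, 2.0))
print(count_exceeding(IDEAL_B, NOISY_N, 2.0))
```

A small count indicates a likely match, so N is assigned to pattern A
rather than pattern B.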


Three patterns, each with twenty points, were created to be used
as ideal patterns; they are shown in Fig 4-4(a), (b), and (c). Uniform
noise within a rectangle about each point was used to create three
noisy versions of each of the three ideal patterns, as demonstrated
on one point in Fig 4-4(a).

Each of the noisy versions was compared with each of the ideal
patterns as explained in Sec 4.2. The absolute difference between
corresponding components ranged from 0 to 20. A wide range of
thresholds was possible for perfect classification. Using an
absolute difference threshold of 4.0, the results are shown in
Fig 4-5. The x-axis shows the total number of differences greater
than the threshold. Note that the cluster from comparing ideal
patterns with noisy versions of themselves is well separated from
the cluster from comparing ideal patterns with noisy versions of
other ideal patterns.

Fig 4-4(a). Synthetic pattern with twenty points used in testing the SIDV
classification procedure. The box around the point represents
the amount of noise allowed for each point.

Fig 4-5. Histogram of result of experimentation comparing three synthetic
patterns with noisy versions of themselves. The three patterns
are shown in Fig 4-4. (x-axis: number of differences greater than
threshold = 4.0.)

The effect of deleting a point from a noisy version of an
ideal pattern was investigated next. Deleted points are possible
when dealing with non-planar terrain as discussed in Sec 1.4.

For each of the three ideal patterns, five noisy patterns
were generated but with one point missing (a different point each
time). The SIDVs of the noisy patterns are shorter than the SIDVs
of the ideal patterns since the patterns have fewer points.

In order to compare using corresponding component differences,
the two sets of SIDVs must be made the same length. This was
done by adding distances to the shorter SIDVs. The added distances
were uniformly distributed over the range of values in the shorter
list. (If a point on the outskirts of an ideal pattern was deleted,
the missing distances would range over the possible distance
lengths. Thus adding distances that are distributed over the range
of possible distance lengths is like adding a point to the edge of
the ideal pattern.)

Fig 4-6. Histogram of result of experimentation comparing three synthetic
patterns with five noisy versions of themselves. The noisy patterns
had one deleted point.

Using the threshold of 4.0 it was possible to correctly classify
thirty-four out of the thirty-six comparisons. The histogram is
shown in Fig 4-6.
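The padding step described above might be sketched as follows. This is
only one possible reading: the report says the added distances were
"uniformly distributed" over the range of the shorter list, and the
sketch assumes evenly spaced values strictly inside that range (random
uniform draws would be another interpretation). The function name
`pad_sidv` is hypothetical.

```python
def pad_sidv(short, target_len):
    """Lengthen a sorted distance vector to target_len by inserting
    evenly spaced values between its smallest and largest entries.
    (Assumption: even spacing stands in for 'uniformly distributed'.)"""
    missing = target_len - len(short)
    if missing <= 0:
        return list(short)
    lo, hi = short[0], short[-1]
    step = (hi - lo) / (missing + 1)
    added = [lo + (k + 1) * step for k in range(missing)]
    return sorted(list(short) + added)

print(pad_sidv([1.0, 2.0, 4.0], 5))
```

For a twenty point ideal pattern the SIDV has 190 entries, while a
nineteen point noisy pattern yields 171, so 19 distances would be
added before the componentwise comparison.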
5. Nearest Neighbor Algorithm

The various classification procedures using the sorted
interpoint distance vector provided good classification in limited
experimentation, but did not appear feasible if noisy patterns
contained more than one or two additions or deletions.
The primary problem in this more general situation is that
SIDV methods involve matching a sequence to a subsequence of
another sequence, which can rapidly become computationally
expensive. To overcome the combinatorial problem, a nearest
neighbor procedure was devised where a pattern consisting of n
points is represented by a vector with n components, in contrast
to the $(n^2 - n)/2$ components required by the SIDV methods.

5.1 Theory

The vector of length n assigned to a pattern by the nearest
neighbor procedure is obtained by finding the distance from each
point to its closest neighbor and then sorting these distances
in increasing order.

Def. Let $P = \{a_1,\dots,a_n\}$ be a pattern. The sorted nearest
neighbor vector, denoted SNN(P), is the n-tuple whose i-th
component, denoted $SNN(P)_i$, is the i-th smallest element in
the collection $\{\min_{j \ne i} d(a_i, a_j)\}_{i=1}^{n}$.
The motivations for selecting this vector SNN(P) to represent
the pattern P are: 1) the size of the vector is small,
reducing combinatorial problems in matching, 2) an α-percent
regular point classification procedure can be defined using SNN(P).
Major disadvantages of this scheme are 1) many dissimilar
patterns have identical sorted nearest neighbor vectors, 2) the
sorted nearest neighbors are mainly the relatively shorter edge
lengths in the complete graph of the pattern and short edge
lengths may tend to vary by a larger fraction of their length
than longer ones.


Ex. 5.2

Pattern 1: (0,0), (1,0), (2,1), (1,2), (2,2), (4,1)

Pattern 2: (0,0), (1,0), (2,0), (3,0), (4,0), (6,0)

The two patterns shown above have the same
SNN = (1,1,1,1,1,2)
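The claim in Ex. 5.2 can be checked with a short sketch of the SNN
definition. The function name `snn` is an assumption introduced here,
and the brute-force search over all pairs is chosen for clarity rather
than speed.

```python
import math

def snn(points):
    """Sorted nearest neighbor vector: for each point, the distance to
    its closest other point, with the n distances sorted increasingly."""
    return sorted(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    )

pattern_1 = [(0, 0), (1, 0), (2, 1), (1, 2), (2, 2), (4, 1)]
pattern_2 = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (6, 0)]
print(snn(pattern_1))
print(snn(pattern_2))
```

Both calls give the same vector even though the two point sets are
clearly different shapes, which illustrates the first disadvantage
listed above.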

The SNN algorithms are identical to the SIDV algorithms
except that, of course, the SNN is used. As a first step in
understanding the usefulness of the SNN, we examine its
behavior under α-percent perturbations.

Lemma I Let $P = \{a_1,\dots,a_n\}$ be a pattern and let
$P' = \{b_1,\dots,b_n\}$ be an α-percent perturbation of P
such that $b_i$ corresponds to $a_i$ for $i = 1,\dots,n$. For
$i = 1,\dots,n$ let

\[ d_i = \min_{1 \le j \le n,\ j \ne i} d(a_i, a_j) \quad \text{and} \quad d_i' = \min_{1 \le j \le n,\ j \ne i} d(b_i, b_j). \]

Then $(1-\alpha)d_i \le d_i' \le (1+\alpha)d_i$ for $i = 1,\dots,n$.

Proof: Let $1 \le i_0 \le n$. Select $j_0$ such that $d(b_{i_0}, b_{j_0}) = d_{i_0}'$
and $k_0$ such that $d(a_{i_0}, a_{k_0}) = d_{i_0}$. Then

\[ d_{i_0}' = d(b_{i_0}, b_{j_0}) \ge (1-\alpha)\,d(a_{i_0}, a_{j_0}) \ge (1-\alpha)\,d(a_{i_0}, a_{k_0}) = (1-\alpha)\,d_{i_0} \]

and

\[ d_{i_0}' = d(b_{i_0}, b_{j_0}) \le d(b_{i_0}, b_{k_0}) \le (1+\alpha)\,d(a_{i_0}, a_{k_0}) = (1+\alpha)\,d_{i_0}. \]

Thus $(1-\alpha)d_{i_0} \le d_{i_0}' \le (1+\alpha)d_{i_0}$.

So the distance of a point to its nearest neighbor can change
by no more than α-percent under an α-percent perturbation.
Using Lemma I and Theorem I, we show that the SNN changes
componentwise by no more than α-percent under an α-percent
perturbation.

Thm II Let $P = \{a_1,\dots,a_n\}$ be a pattern and let
$P' = \{b_1,\dots,b_n\}$ be an α-percent perturbation of P. Then

\[ \frac{|SNN(P)_i - SNN(P')_i|}{SNN(P)_i} \le \alpha \]

for $i = 1,\dots,n$.

Proof: We first show that if $v, w \in P$ and w is the nearest
neighbor of v, and v' and w' are the corresponding
points in P', and $z' \in P'$ is a nearest neighbor of v',
then

\[ \frac{|d(v,w) - d(v',z')|}{d(v,w)} \le \alpha. \]

Since P' is an α-percent perturbation of P,

\[ (1-\alpha)\,d(v,w) \le d(v',w') \le (1+\alpha)\,d(v,w). \]

Thus if z' is the nearest neighbor of v',

\[ d(v',z') \le d(v',w') \le (1+\alpha)\,d(v,w). \]

Assume z is the point in P corresponding to z' in P'.
Then

\[ (1-\alpha)\,d(v,z) \le d(v',z') \le (1+\alpha)\,d(v,z). \]

Since w is the nearest neighbor of v,

\[ d(v,w) \le d(v,z). \]

Thus

\[ (1-\alpha)\,d(v,w) \le (1-\alpha)\,d(v,z) \le d(v',z'). \]

We now have

\[ (1-\alpha)\,d(v,w) \le d(v',z') \le (1+\alpha)\,d(v,w). \]

Hence

\[ \frac{|d(v,w) - d(v',z')|}{d(v,w)} \le \alpha. \]

Without loss of generality, assume that $b_i$ corresponds
to $a_i$. Let $a_i^{*}$ be a nearest neighbor of $a_i$ and $b_i^{*}$ a
nearest neighbor of $b_i$. By the above argument,

\[ \frac{|d(a_i, a_i^{*}) - d(b_i, b_i^{*})|}{d(a_i, a_i^{*})} \le \alpha. \]

Thus the theorem follows immediately by applying
Theorem I.
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5. 2  Example  Using  SNN  Algorithm 


In this section, an example using the sorted nearest neighbor vector is discussed. Fig 5-1(a), (b) and (c) show three different ideal patterns (even though they "look" similar). Fig 5-2(a)-(c) show the same three patterns but with the nearest neighbors connected.

Let the distances to the nearest neighbor in each pattern be sorted and then calculate the range allowed for the nearest neighbor distance using the results of Thm II with α = 10%. The list of sorted nearest neighbors with the allowed ranges is shown in Fig 5-3.

Now suppose the noisy pattern that is to be classified has the sorted nearest neighbor distances NP(1) thru NP(5).

Using Thm II, these distances must satisfy the following inequalities in order to be a noisy version of pattern A.


    $4.5 \leq NP(1) \leq 5.5$
    $9.0 \leq NP(2) \leq 11.0$
    $13.5 \leq NP(3) \leq 16.5$
    $18.0 \leq NP(4) \leq 22.0$
    $22.5 \leq NP(5) \leq 27.5$

Similar sets of inequalities must be satisfied in order for a noisy pattern to be classified as a noisy version of pattern B or C.

If the sorted nearest neighbor distances of the noisy pattern are 5.7, 9.7, 13.8, 19.2 and 24.0, then it is a noisy version of pattern B.
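The range test of this example can be sketched in a few lines. This is an illustration only: the dictionary `RANGES` holds the allowed ranges as transcribed from Fig 5-3, and the function name `candidates` is a hypothetical choice, not a routine from the report.

```python
# Allowed ranges (low, high) for NP(1)..NP(5), transcribed from Fig 5-3.
RANGES = {
    "A": [(4.5, 5.5), (9.0, 11.0), (13.5, 16.5), (18.0, 22.0), (22.5, 27.5)],
    "B": [(5.6, 6.8), (9.0, 11.0), (12.6, 15.4), (18.0, 22.0), (22.5, 27.5)],
    "C": [(5.8, 7.2), (9.0, 11.0), (15.5, 18.9), (18.0, 22.0), (22.5, 27.5)],
}

def candidates(noisy_snn, ranges=RANGES):
    """Return the names of the ideal patterns whose allowed ranges contain
    every component of the sorted nearest neighbor vector noisy_snn."""
    matches = []
    for name, bounds in ranges.items():
        if len(bounds) == len(noisy_snn) and all(
            low <= x <= high for (low, high), x in zip(bounds, noisy_snn)
        ):
            matches.append(name)
    return matches

result = candidates([5.7, 9.7, 13.8, 19.2, 24.0])
print(result)  # ['B']
```

The first component alone rules out patterns A and C here, since 5.7 lies above 5.5 and below 5.8.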
Fig 5-1. Three different ideal patterns used to illustrate SNN classification: (a) Pattern A, (b) Pattern B, (c) Pattern C.

Fig 5-2. The three patterns of Fig 5-1 are shown with the nearest neighbors connected: (a) Pattern A, (b) Pattern B, (c) Pattern C.


    Pattern A             Pattern B             Pattern C

     5    (4.5-5.5)        6.2  (5.6-6.8)        6.5  (5.8-7.2)
    10    (9.0-11.0)      10    (9.0-11.0)      10    (9.0-11.0)
    15    (13.5-16.5)     14    (12.6-15.4)     17.5  (15.5-18.9)
    20    (18.0-22.0)     20    (18.0-22.0)     20    (18.0-22.0)
    25    (22.5-27.5)     25    (22.5-27.5)     25    (22.5-27.5)

Fig 5-3. Sorted nearest neighbor distances for the three patterns plus the allowed range of nearest neighbor distances, in parentheses, assuming 10% noise.
In order to test the SNN algorithm, nineteen ideal patterns with twenty points each were generated. Each roughly covered the same area of 100 by 100 units. The nineteen patterns are shown in Appendix II.

The patterns were completely distinguishable using α = 10%. The classification algorithm is shown in Fig 5-4. Thus any α-noisy version of the nineteen patterns would be perfectly classified.


SNN Classification Algorithm for 19 Synthetic Patterns

(1)   If (NP(14)>8.2) then NP is pattern A.
(2)   If (NP(12)>8.2 and NP(20)->15.) then NP is pattern B.
(3)   If (NP(5)>8.2 and NP(20)>18.) then NP is pattern C.
(4)   If (NP(10)>7.1 and NP(20)>18.) then NP is pattern D.
(5)   If (NP(16)>11.7 and NP(9)>8. and NP(20)>30.) then NP is pattern E.
(6)   If (NP(20)>21. and NP(17)*16.8 and NP(7)>8.4 and NP(8)>8.4 and NP(16)>12.1) then NP is pattern F.
(7)   If (NP(9)>7. and NP(20)>25.) then NP is pattern G.
(8)   If (NP(20)>30. and 8.2<NP(1)<11.8 and NP(4)>11.8 and NP(10)>11.2) then NP is pattern H.
(9)   If (NP(20)>40. and NP(20)>35. and NP(7)>8.2 and NP(19)>25.) then NP is pattern I.
(10)  If (NP(20)>40. and NP(8)>8.2) then NP is pattern J.
(11)  If (NP(20)>34. and NP(19)>22.) then NP is pattern K.
(12)  If (NP(19)>26. and NP(1)>8.2) then NP is pattern L.
(13)  If (NP(1)>8.2 and NP(5)>8.2) then NP is pattern M.
(14)  If (NP(1)>8.2 and NP(5)>8.2 and NP(6)>8.2) then NP is pattern N.
(15)  If (NP(1)>8.2 and NP(5)>8.2 and NP(6)>8.2 and NP(12)>11.8) then NP is pattern O.
(16)  If (NP(1)->8.2 and NP(5)>8.2 and NP(6)t-8.2 and NP(12)>11.8) then NP is pattern P.
(17)  If (NP(1)>11.8) then NP is pattern Q.
(18)  If (NP(4)>11.8) then NP is pattern R.
(19)  Otherwise NP is pattern S.

Fig 5-4. The SNN algorithm for classifying a noisy pattern as one of the nineteen synthetic patterns.
6. Minimal Spanning Tree Methods


The SIDV and SNN do not explicitly encode any global information about a pattern. The only information used in constructing an SIDV or SNN is the distance between two points; no information about the structure is included. Nevertheless experiment indicates the potential of these vectors for classification. In order to explicitly capture more global pattern information, the minimal spanning tree of a pattern P, denoted MST(P), was explored. A series of papers [5,6,7] developed applications of the MST to clustering, bubble chamber pictures, and noisy template matching. Several results, which are useful in analyzing the MST methods, are given in this section.

6.1 Graph Theoretical Definitions of MST's

Basic notions from graph theory [2] will be presented in this section. Applications to point pattern classification will be given in the next section. Graph theory is the study of structures composed of points and lines connecting them. A graph is a finite set of points in $R^n$* and a set of lines connecting some pairs of points. For present purposes our graphs will lie in $R^2$. The points in the graph are called vertices and the lines are called edges.

*R denotes a space of real numbers and n denotes the dimension of the space.


Ex. 6.1

a)  vertices = {a,b}
    edges = {$e_{ab}$} where $e_{ab}$ denotes the edge joining vertices a and b.

b)  vertices = {a,b,c,d,e,f}
    edges = {$e_{ab}, e_{bc}, e_{cd}, e_{de}, e_{ef}$}

The set of vertices of a graph G is denoted V(G) and the set of edges of G is denoted E(G). A weighted graph is a graph, together with an assignment of a positive real number to each edge. The weight of a weighted graph G is the sum of the numbers assigned to the edges of G.
Ex. 6.2

The depicted graph is a weighted graph with the weights written next to the corresponding edges. The total weight of the graph is 1+2+3+4+5 = 15.

A path in a graph G is a sequence of vertices and edges of G, $v_0 e_1 v_1 e_2 \dots e_k v_k$, where the $e_i$ ($1 \leq i \leq k$) are edges, the $v_i$ ($0 \leq i \leq k$) are vertices, edge $e_i$ has endpoints $v_{i-1}$ and $v_i$, $e_i \neq e_j$ for all $i \neq j$, and no two vertices are the same except for possibly $v_0$ and $v_k$. A cycle is a path in which $v_0 = v_k$.

Ex. 6.3

$a, e_{ab}, b, e_{bc}, c, e_{cd}, d$ is a path. $a, e_{ab}, b, e_{bc}, c, e_{ca}, a$ is a cycle.

A graph is called connected if given any two distinct vertices a and b of the graph there is a path $v_0 e_1 v_1 \dots e_k v_k$ such that $v_0 = a$ and $v_k = b$. A tree is a connected graph with no cycles.

Ex. 6.4  The graph of example 6.3 is connected while that of example 6.2 is not.

A  spanning  tree  T  of  a  graph  G  is  a  tree  whose  vertices  are  exactly 
those  of  G  and  whose  edges  are  a  subset  (possibly  all)  of  the  edges 
of  G. 


A graph G and a spanning tree T of G are shown above.

A minimal spanning tree of a weighted graph G is a spanning tree of G such that no other spanning tree of G has less weight.
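For a point pattern, the weighted graph of interest is the complete graph on the points with Euclidean edge lengths as weights. The sketch below computes one minimal spanning tree of that graph with Prim's algorithm, a standard method that the report does not itself name; the function name `mst_edges` and the sample points are illustrative assumptions.

```python
import math

def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph of `points`.
    Returns a list of (i, j, length) tuples, one per tree edge."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n      # cheapest known connection to the tree
    parent = [-1] * n          # tree vertex giving that connection
    edges = []
    if n == 0:
        return edges
    best[0] = 0.0
    for _ in range(n):
        # Pick the vertex outside the tree that is cheapest to attach.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u, best[u]))
        # Update the attachment cost of the remaining vertices.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

sample = [(0, 0), (1, 0), (1, 1), (0, 3)]
tree = mst_edges(sample)
total_weight = sum(length for _, _, length in tree)
print(tree, total_weight)
```

When several edges have equal length the tree returned depends on the order of the points, which is the non-uniqueness discussed in the next section.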


6.2 Theory of Using MST's for Classification

The SIDV of a pattern is uniquely determined by the pattern and is thus independent of the order in which the points in the pattern are processed by the algorithms. This is, unfortunately, not true for the MST. Any graph in which some point has a non-unique nearest neighbor has at least two MST's, which need not be isomorphic.

Ex. 6.7

[Figure: a weighted graph G and two trees T and $T_1$.]

T and $T_1$ are distinct MST's for the graph G.


Lemma II  Let P be a pattern. Let G be the complete graph on P and let S be a set of edges in G that are known to lie in every MST for G. Let $G_S$ be the subgraph of G consisting of the edges in S. Let $C_S = \{C_1,\dots,C_w\}$ denote the set of connected components of $G_S$. Define the distance, $d(C_i,C_j)$, between two components $C_i$ and $C_j$ by

    $d(C_i,C_j) = \min_{e \in E(G)} L(e)$

where e has one end in $C_i$ and the other in $C_j$. Let $f \in E(G) - S$. Assume there exists an i, j such that

    $d(C_i,C_j) = L(f)$

and f has one end in $C_i$ and one end in $C_j$ and

    $d(C_i,C_j) \leq d(C_i,C_k)$

for $k \neq i,j$ and for all $g \in E(G)$ such that g has one end in $C_i$ and the other in $C_j$, we have $g \neq f$ implies $L(f) < L(g)$. Then f is in every MST of G.

The above result provides information on the possible variations among MST's for a given point set. Having obtained some understanding of this variability, we now turn to the variability in MST's for an α-percent perturbation of a point pattern. If one can determine a set of edges, E, contained in every MST of a point pattern P, such that any α-percent perturbation P' of P contains a subgraph consisting of an α-percent perturbation of the edges of E, then we have a potentially powerful tool for matching patterns.


Ex. 6.8

[Figure: Pattern P]

The edges $e_{12}$, $e_{23}$, $e_{34}$ are in any MST of pattern P by Thm III. Furthermore as we will show in Thm IV, any 10-percent perturbation P' of P will have the following property. Given any MST M of P', there exist 3 edges $e_{ab}$, $e_{bc}$, and $e_{cd}$ where a, b, c, and d are vertices of M, no two of which are equal, and

    $\dfrac{|L(e_{ab}) - L(e_{12})|}{L(e_{12})} \leq 0.1$

    $\dfrac{|L(e_{bc}) - L(e_{23})|}{L(e_{23})} \leq 0.1$

    $\dfrac{|L(e_{cd}) - L(e_{34})|}{L(e_{34})} \leq 0.1$

Thus given a pattern Q, one test that can be applied to determine if it could be a 10-percent perturbation of P is to compute an MST M for Q and determine if M has 3 adjacent edges satisfying the above inequalities.
If such equal edge lengths are the only sources of non-uniqueness in MST's then one might determine the extent to which this non-uniqueness hinders matching. We now show that equal nearest neighbor edge lengths are the only sources of non-uniqueness in MST's.

Thm III  Let P be a pattern, let G be the complete graph on the pattern and let M be a minimal spanning tree of G. If $e_{vw}$ is an edge of G such that either

    1)  $d(v,w) < d(v,z)$ for $z \in V(G)$, $z \neq v$, $z \neq w$

or

    2)  $d(v,w) < d(w,z)$ for $z \in V(G)$, $z \neq v$, $z \neq w$,

then $e_{vw} \in E(M)$.

Proof:  Assume $e_{vw} \notin M$. Then $M \cup \{e_{vw}\}$ * is a graph with exactly one cycle and this cycle contains the edge $e_{vw}$. The cycle must contain edges f, g, $f \neq g$, such that f is incident on v, g is incident on w, and $f \neq e_{vw}$, $g \neq e_{vw}$. If such edges did not exist, M would not be connected. Suppose condition 1) holds (if instead condition 2) holds, exchange the roles of f and g). $G' = (M \cup \{e_{vw}\}) - \{f\}$ is a connected graph since removing an edge from a cycle cannot disconnect it. Furthermore $L(e_{vw}) < L(f)$ where L denotes the weight of the edge. Thus $L(G') < L(M)$, contradicting the minimality of M. Hence $e_{vw} \in M$.
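Thm III can be illustrated numerically: every edge joining a point to its strictly nearest neighbor should appear in any MST. The sketch below is an illustration under assumptions, not the procedure used in the study: it builds an MST with Kruskal's algorithm (a standard method chosen here for brevity) and the names `kruskal_mst` and `strict_nearest_neighbor_edges`, as well as the sample points, are hypothetical.

```python
import math
from itertools import combinations

def kruskal_mst(points):
    """Kruskal's algorithm on the complete Euclidean graph.
    Returns the tree as a set of index pairs (i, j) with i < j."""
    parent = list(range(len(points)))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = set()
    by_length = sorted(combinations(range(len(points)), 2),
                       key=lambda e: math.dist(points[e[0]], points[e[1]]))
    for i, j in by_length:
        ri, rj = find(i), find(j)
        if ri != rj:           # edge joins two components: keep it
            parent[ri] = rj
            tree.add((i, j))
    return tree

def strict_nearest_neighbor_edges(points):
    """Edges e_vw for which w is strictly closer to v than any other point
    (condition 1 or 2 of Thm III)."""
    forced = set()
    for v in range(len(points)):
        dists = sorted((math.dist(points[v], points[z]), z)
                       for z in range(len(points)) if z != v)
        if len(dists) == 1 or dists[0][0] < dists[1][0]:
            w = dists[0][1]
            forced.add((min(v, w), max(v, w)))
    return forced

pts = [(0, 0), (4, 0), (5, 3), (11, 1), (9, 8), (2, 7)]
tree = kruskal_mst(pts)
forced = strict_nearest_neighbor_edges(pts)
print(forced <= tree)  # every forced edge is a tree edge
```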

We  can  show  that  this  procedure  for  deciding  which  edges 
can  be  in  an  MST  can  be  iterated.  As  a  corollary  of  this  result, 
we  will  see  that  if  no  two  interpoint  distances  in  a  point  pattern 
are  the  same,  the  MST  is  uniquely  determined  [5]. 


*U  denotes  the  operation  of  set  Union. 
Thm IV  Let P be a pattern, let G be the complete graph on P. Let α be a real number greater than zero and let P' be an α-percent perturbation of P. Let G' be the complete graph on P' and let M be a MST for G'. If $e_{vw}$ is an edge of G such that either

    1)  $d(v,w) < d(v,z)$ for $z \in V(G)$, $z \neq v$, $z \neq w$

or

    2)  $d(v,w) < d(w,z)$ for $z \in V(G)$, $z \neq v$, $z \neq w$,

then $e_{v'w'} \in E(M)$, where v' and w' are the points of P' corresponding to v and w.

This follows immediately from Theorems I and III.

The MST of a point pattern can be used for many types of pattern matching. We have experimented with three procedures: 1) matching of degree sequences, 2) matching of longest linear paths and 3) matching of adjacent pairs of edges. The first and third procedures depend only upon the measurement of the edges incident upon a point. Thus these procedures are in the spirit of the SIDV or the SNN in that they attempt to match whole patterns by requiring a set of local measurements to match up. In contrast to the earlier procedures, the measurement generally takes into account at least two edges simultaneously, thus providing a slightly more global set of local operations. Modifications of the adjacent edges procedure permit one to take far more global structure into account in the local operations, but at greater computational expense.
6.3 Examples of MST's

Minimal  spanning  trees  were  generated  for  the  nineteen  ideal 
patterns  used  for  experimentation  in  the  nearest  neighbor  algorithm. 
They  are  shown  in  Appendix  III.  One  noisy  version  of  each  ideal 
pattern  was  calculated  and  their  MST's  are  shown  in  Appendix  IV. 

It  is  easy  to  see  that  the  MST  does  change  under  noise  but 
that  the  global  structure  seems  to  remain  stable. 

6.4 Experimentation with MST's

Experimentation  of  classification  using  the  MST  algorithm  is 
incomplete  at  the  end  of  this  phase  of  the  investigation. 

Attempts to classify using matching of degree sequences or matching using longest linear paths did not work on the patterns discussed above.

A  degree  of  a  vertex  (point)  in  a  MST  is  the  number  of  edges 
coming  out  of  that  node.  A  degree  sequence  of  a  MST  is  then  the 
sorted  list  of  degrees  of  the  points  in  the  MST.  Fig  6-1  shows 
the  degree  sequences  of  the  MST’s  for  the  ideal  patterns  and  their 
noisy  patterns.  There  was  not  enough  of  a  match,  in  general, 
between  the  degree  sequences  of  an  ideal  pattern  and  its  noisy 
version  to  devise  a  classification  scheme. 

The next attempt in finding a classification scheme used longest linear paths. Longest linear paths are found by starting at a vertex of degree one, and following edges till reaching a vertex of either degree one, or greater than degree two. The number of edges in the path is its length. Fig 6-2 shows the sorted longest linear paths for the ideal patterns and their noisy patterns. Again, a classification scheme using the longest linear paths was not possible.
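Both tree features described above can be computed directly from the edge list of a tree. The following is a minimal sketch under stated assumptions: the tree is given as a list of vertex pairs, the function names are illustrative, and a path that runs between two degree-one vertices is reported once from each end, a convention the report does not specify.

```python
from collections import defaultdict

def degree_sequence(edges):
    """Sorted list of vertex degrees of a tree given as (u, v) pairs."""
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    return sorted(degree.values())

def linear_path_lengths(edges):
    """Edge counts of the paths that start at a degree-one vertex and follow
    degree-two vertices until a vertex of degree one or of degree greater
    than two is reached. Returned longest first."""
    adjacent = defaultdict(list)
    for u, v in edges:
        adjacent[u].append(v)
        adjacent[v].append(u)
    lengths = []
    for leaf in [v for v in adjacent if len(adjacent[v]) == 1]:
        previous, current, steps = leaf, adjacent[leaf][0], 1
        while len(adjacent[current]) == 2:
            a, b = adjacent[current]
            previous, current = current, (a if a != previous else b)
            steps += 1
        lengths.append(steps)
    return sorted(lengths, reverse=True)

# Hypothetical six-vertex tree: 0-1-2, with 2 also joined to 3 and to 4-5.
example_tree = [(0, 1), (1, 2), (2, 3), (2, 4), (4, 5)]
print(degree_sequence(example_tree))      # [1, 1, 1, 2, 2, 3]
print(linear_path_lengths(example_tree))  # [2, 2, 1]
```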
The final attempt was to match MST's by their adjacent edge
lengths.  For  this  technique  all  possible  adjacent  edge  lengths  of 
the  tree  are  found  and  sorted.  Then  classification  is  done  by 
deciding  whether  the  sequences  from  two  trees  match. 

The  comparison  of  adjacent  edge  lengths  is  much  more  complicated 
than  the  comparison  of  distances  done  in  the  SIDV  and  SNN  algorithms. 
In  SIDV  and  SNN  algorithms  each  component  of  the  sequence  was  com¬ 
posed  of  only  single  numbers  and  there  was  an  obvious  way  to  sort 
and  compare  them.  Here,  in  the  adjacent  edge  length  case,  each 
component  is  composed  of  two  numbers  and  the  method  of  sorting  and 
comparing  is  not  as  obvious. 

Testing  of  this  method  is  not  complete,  but  preliminary  results 
show  that  this  method  might  give  a  good  classification  scheme. 


7. Patterns Specified by a Set of Constraints

The point patterns considered in sections 3-6 were perturbations of some ideal configurations of points. The perturbations considered were subject only to the restriction that no interpoint distance should change by more than a fixed percentage. We now describe an important class of point patterns, which we call constrained patterns, which include the α-percent perturbation model. Unlike the α-percent perturbation models, the constrained patterns do not admit a simple class of general recognition algorithms, though we present an example to illustrate how certain realistic instances can be handled quite easily.

A  constrained  pattern  is  merely  a  set  of  conditions  which  can 
be  tested  on  a  point  set  in  the  plane.  If  a  point  set  satisfies 
all  the  conditions  it  is  called  an  instance  of  the  pattern.  Dif¬ 
ferent  classes  of  constrained  patterns  can  be  defined  depending  on 
the  nature  of  the  conditions.  Examples  of  useful  conditions  include 
restrictions  on  interpoint  distances,  angles  formed  by  lines  join¬ 
ing  certain  points,  total  size  of  the  pattern  and  minimum  distance 
between  certain  points.  Little  can  be  said  about  the  problem  at 
this  level  of  generality. 
7.1 An Example


We now give an example of a pattern closely related to an existing real system. The pattern, shown in Fig 7-1, consists of seven points. Three points form the vertices of an equilateral triangle of side length 50 feet at the vertex of the V. None of the remaining four points can be closer than 150 feet to one another or to any of the three points at the vertex. It must be possible to superimpose the shaded V of Fig 7-2 on the pattern in such a way that the mean of the three vertex points lies at the vertex of the V, and regions A and B contain two points each. An algorithm that gives a complete solution to the recognition problem for this pattern is presented below. Any set of points satisfying the constraints of the pattern will be labeled by the algorithm as an instance of the pattern. We first give an informal statement of the algorithm which we call Algorithm V.

Algorithm V (Informal)

1)  Locate the vertex of the V by finding three points, a, b, c, whose interpoint distances are all 50 feet. Find the mean m of these three points.

2)  Project all points other than a, b, and c onto the unit circle with center m.

    Compute the angles $v_1,\dots,v_4$ formed by the consecutive points on the circle. Unless exactly one of the $v_i$ is greater than 150°, state that this is not a V and terminate.

3)  Renumber the subscripts of the $v_i$'s cyclically if necessary so that $v_1$, $v_2$ and $v_3$ are each $\leq 150°$.

4)  Test the $v_i$'s for the satisfaction of inequalities required by the V pattern (to be discussed next). If all the inequalities are satisfied, an instance of the pattern has been found, else the points do not form an instance of the pattern.

We now give a formal definition of the V pattern.

Def.  A V pattern is a set $P = \{a_1,\dots,a_7\}$ of seven points in $R^2$ where the coordinates of the point $a_i$ are $(x_i,y_i)$. The set P must satisfy the following conditions:

    1)  $d(a_5,a_6) = d(a_5,a_7) = d(a_6,a_7) = 50$ where d denotes Euclidean distance

    2)  Let m be the point with coordinates $(m_1,m_2)$ where

            $m_1 = \frac{1}{3}(x_5+x_6+x_7)$
            $m_2 = \frac{1}{3}(y_5+y_6+y_7)$

        For i = 1,2,3,4, $d(a_i,m) \leq 1000$

    3)  For i = 1,2,3,4, j = 1,...,7, $j \neq i$, $d(a_i,a_j) \geq 150$

    4)  The following figure can be superimposed on the plane in such a way that points O and m coincide and points $a_1$ and $a_2$ lie in one shaded region, while points $a_3$ and $a_4$ lie in the other.
Fig 7-1. An example of the V pattern.

The curved segments are circular arcs from a circle of radius 1000 and center O.

We now use the above definition to define a set of inequalities which can be used to test whether the sectors described above can be imposed on a set of points. Assume we have an instance of the V pattern. Relabeling our points if necessary we may assume $A(a_1,a_2)$, $A(a_2,a_3)$, $A(a_3,a_4)$ are less than 150° and $d(m,a_i) \leq 1000$ feet for i = 1,...,4.

Clearly

    $0° \leq A(a_1,a_2) \leq 30°$
    $0° \leq A(a_3,a_4) \leq 30°$
    $90° \leq A(a_2,a_3)$
    $A(a_1,a_4) \leq 150°$

Conversely we shall show that if these inequalities hold for points $a_1, a_2, a_3$ and $a_4$ then we have a V pattern. $A(a_i,a_j)$ denotes the angle formed by $a_i$, O, and $a_j$.
Algorithm for detecting V pattern (Formal):

Basic V pattern - three points at vertex, two on each side of V

I    Compute the interpoint distance matrix X(I,J), I = 1,...,7, J = 1,...,7, where 7 is the number of points in the pattern and X(I,J) is the distance from point I to point J.

II   In the upper triangular matrix {X(I,J)}, I = 1,...,N-1, J = I+1,...,N, compute the number, K, of values X(I,J) satisfying 40 < X(I,J) < 60. If K ≠ 3 terminate the algorithm. The pattern is not a V. If K = 3 and $X(I_0,J_0)$, $X(I_1,J_1)$, and $X(I_2,J_2)$ are the three values found then: If the set $S = \{I_0, J_0, I_1, J_1, I_2, J_2\}$ contains exactly three distinct values go to step III; else stop. The pattern is not a V.

III  Let a, b, c be the three distinct values found in the set S. Set $M = \frac{1}{3}(a_1+b_1+c_1,\; a_2+b_2+c_2)$ where
         point a has coordinates $(a_1,a_2)$
         point b has coordinates $(b_1,b_2)$
         point c has coordinates $(c_1,c_2)$
     (M is the vertex of the V)

IV   Let $T = \{Y_1,\dots,Y_4\}$ be the set of points in the pattern excluding points a, b and c.

     Let $e_i$ denote the line segment joining M and $Y_i$ for i = 1,...,4.

     Let $f_i$ denote the angle in the counterclockwise direction from a horizontal line through M to $e_i$.

     If any length($e_i$) is greater than 1000 terminate with failure.
V    Sort the values $f_i$, i = 1,...,4 in increasing order. Denote the sorted angles by $v_1,\dots,v_4$.

     Compute the values $\{v_{i+1} - v_i\}$, i = 1,2,3, and $v_1 - v_4$ (taken modulo 360°). Denote this set by Q.

     If exactly one element of Q is greater than 150° go to step VI; else terminate with failure.

VI   Renumbering the subscripts of the $v_i$ if necessary, we may assume, without loss of generality, that $v_1 - v_4$ (taken modulo 360°) is the one value greater than 150°.

VII  Assume N = 7.

     If    $v_2 - v_1 \leq 30°$
     and   $v_3 - v_2 \geq 90°$
     and   $v_4 - v_3 \leq 30°$
     and   $v_4 - v_1 \leq 150°$
     then terminate with success.
     Otherwise terminate with failure.
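Steps I through VII can be sketched as a single function. This is a hedged illustration rather than the program used in the study: the name `is_v_pattern` and its parameters are assumptions, the tolerance band of 40 to 60 around the 50 foot side is taken as the reading of step II, and, like the formal algorithm, the sketch does not test the 150 foot minimum separation from the definition.

```python
import math
from itertools import combinations

def is_v_pattern(points, side=50.0, tol=10.0, max_radius=1000.0):
    """Return True if the seven (x, y) points pass steps I-VII."""
    if len(points) != 7:
        return False
    # Steps I-II: find the pairs whose distance is close to the triangle side.
    close = [(i, j) for i, j in combinations(range(7), 2)
             if side - tol < math.dist(points[i], points[j]) < side + tol]
    vertex_ids = {i for pair in close for i in pair}
    if len(close) != 3 or len(vertex_ids) != 3:
        return False
    # Step III: M is the mean of the three vertex points.
    mx = sum(points[i][0] for i in vertex_ids) / 3.0
    my = sum(points[i][1] for i in vertex_ids) / 3.0
    # Step IV: lengths and counterclockwise angles of the other four points.
    others = [p for k, p in enumerate(points) if k not in vertex_ids]
    if any(math.dist((mx, my), p) > max_radius for p in others):
        return False
    v = sorted(math.degrees(math.atan2(y - my, x - mx)) % 360.0
               for x, y in others)
    # Step V: consecutive gaps, the last one being the wrap-around gap.
    gaps = [v[1] - v[0], v[2] - v[1], v[3] - v[2], 360.0 + v[0] - v[3]]
    big = [k for k, g in enumerate(gaps) if g > 150.0]
    if len(big) != 1:
        return False
    # Step VI: renumber cyclically so the large gap is the wrap-around one.
    start = (big[0] + 1) % 4
    v = [v[(start + k) % 4] + (360.0 if start + k >= 4 else 0.0)
         for k in range(4)]
    # Step VII: the four angle inequalities.
    return (v[1] - v[0] <= 30.0 and v[2] - v[1] >= 90.0
            and v[3] - v[2] <= 30.0 and v[3] - v[0] <= 150.0)
```

Angles that fall exactly on a limit such as 30° may be decided either way by floating point rounding, so test data should stay clear of the boundaries.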
From the definition of the V pattern, it can easily be shown that any instance of this pattern will be recognized by the algorithm. We now show that the algorithm claims detection for a restricted V pattern only if it truly is a V pattern. To see this, it suffices to show that a V can be superimposed on the pattern if the restrictions on angles given in step VII of algorithm V are satisfied. Thus we assume we have a point M which will be the vertex of our V and four points $a_1, a_2, a_3$ and $a_4$.

Let $V_1$, $V_2$, and $V_3$ be the angles shown in Fig 7-3. We assume $0° \leq V_1 \leq 30°$, $0° \leq V_3 \leq 30°$, $V_2 \geq 90°$ and $V_1+V_2+V_3 \leq 150°$. Let V denote the shaded region in Fig 7-4.
Fig 7-4.

Suppose we superimpose Fig 7-3 on Fig 7-4 by placing point O on point M and line $Oh_2$ on $Ma_1$. Since $V_1+V_2+V_3 \leq 150°$, none of the points $a_1, a_2, a_3$ and $a_4$ can lie outside the angle $h_2Oh_4$. Since $V_1 \leq 30°$ and $\angle h_1Oh_2 = 30°$, both $a_1$ and $a_2$ will lie in a shaded region. If $a_3$ and $a_4$ lie in the other shaded region, we are done.

If not, then $a_3$ must lie in the angle $h_1Oh_3$. Shift the V clockwise, by less than 30°, until line $Oh_3$ lies on $Ma_3$.

Since $V_3 \leq 30°$ both $a_3$ and $a_4$ lie in angle $h_3Oh_4$. If $a_1$ and $a_2$ lie in angle $h_1Oh_2$ we are done. Neither $a_1$ nor $a_2$ can lie outside angle $h_2Oh_4$ since we shifted clockwise by less than 30°. Thus the only way $a_1$ or $a_2$ can fail to lie in angle $h_1Oh_2$ is if $a_2$ lies in angle $h_1Oh_3$. But if this happens then $V_2$ is less than 90°. This is a contradiction since we assumed $V_2 \geq 90°$. Thus we can superimpose a V pattern on the points.


8. Conclusions


Two  classes  of  point  patterns  were  investigated.  The  first 
class  was  the  prototype  pattern  with  a  noise  model.  We  feel  that 
classification  of  these  patterns  can  be  accomplished  using  the  dis¬ 
cussed  methods,  SIDV,  SNN  and  MST,  under  certain  restrictions.  If 
the  number  of  points  in  the  pattern  is  small,  less  than  100,  then 
the  SIDV  method  would  be  better.  If  the  number  of  points  in  the 
pattern  is  large,  more  than  100,  then  the  SNN  or  MST  methods  would 
be  more  useful. 

The  other  restriction  concerns  the  relative  number  of  additions 
or  deletions  of  pattern  points.  If  the  percentage  of  additions  or 
deletions  is  small,  for  example  on  the  order  of  10%,  then  the  SIDV, 
SNN  and  MST  methods  are  still  feasible.  If  the  percentage  is  large, 
then  other  methods  should  be  used.  (See  Sec  9.2.) 

The  techniques  used  to  classify  prototype  point  patterns  all 
gave  a  measure  of  the  quality  of  the  match.  This  measure  is  useful 
since  it  can  be  used  to  provide  a  measure  of  the  confidence  on  the 
classification  of  a  particular  pattern. 

The  second  class  of  point  pattern  investigated  was  a  pattern 
specified  by  a  set  of  constraints.  Due  to  lack  of  data,  only  one 
example  was  studied.  An  algorithm  that  could  recognize  any  instance 
of  this  pattern  was  successfully  developed.  Generalization  of  this 
technique was impossible without more data. (See Sec 9.3.)
9. Recommendation for Future Investigation


9.1  MST 

The  experimentation  using  the  adjacent  edge  lengths  of  the  MST 
for  classification  should  be  finished.  Variations  of  this  method 
which  take  into  account  the  angles  of  the  MST  could  also  prove  very 
useful . 

Classification  using  the  adjacent  edge  lengths  is  done  by 
sorting  the  components,  which  consist  of  pairs  of  edge  lengths,  and 
comparing  the  components.  This  method  could  be  expanded  by  includ¬ 
ing  the  angle  between  the  adjacent  edge  lengths.  The  components, 
which  now  consist  of  a  pair  of  edge  lengths  and  an  angle,  are 
sorted  and  compared  componentwise. 

9.2 Additions and Deletions of Pattern Points

The  methods  reported  for  classification,  the  SIDV  and  SNN, 
were  very  successful  for  relatively  few  additions  or  deletions  of 
pattern  points.  These  methods  are  particularly  attractive  because 
they are simple and fast to compute. However, as the number of additions or deletions of points becomes large relative to the number of points in the pattern, classification using the SIDV or SNN could become computationally expensive. In order to evaluate
these  methods  more  thoroughly,  precise  information  about  the  prob¬ 
ability  of  additions  or  deletions  of  points  needs  to  be  provided. 

If this probability turns out to be a significant proportion, then classification using more statistical types of information should be investigated. For example, one class of statistical properties that is independent of translation and rotation is the moments of the pattern. The ij moment of the pattern is defined as $m_{ij} = \sum x^i y^j$. To make the moments independent of the choice of axis, the origin should be moved to the center of gravity of the pattern, defined by $(m_{10}/N,\; m_{01}/N)$, where N is the number of points in the pattern. The x-axis is then chosen to be the principal axis which is the line through the center of gravity about which the spread of the pattern points is least. Odd moments, now defined with respect to the new coordinate system, will give a measure of the balance of points between left and right, or up or down. Even moments will give a measure of the spread of pattern points away from the axis.
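The normalization described above can be sketched as follows. This is an illustration under assumptions: the function name `central_moments` is hypothetical, and the principal axis is located with the usual second-moment formula $\theta = \frac{1}{2}\,\mathrm{atan2}(2\mu_{11}, \mu_{20}-\mu_{02})$, which the report does not spell out.

```python
import math

def central_moments(points, max_order=3):
    """Moments m_ij (i + j <= max_order) of a point pattern after moving the
    origin to the center of gravity and turning the x-axis onto the
    principal axis. Returns a dict keyed by (i, j)."""
    n = len(points)
    cx = sum(x for x, _ in points) / n      # m10 / N
    cy = sum(y for _, y in points) / n      # m01 / N
    shifted = [(x - cx, y - cy) for x, y in points]
    # Second-order moments about the center of gravity.
    mu20 = sum(x * x for x, _ in shifted)
    mu02 = sum(y * y for _, y in shifted)
    mu11 = sum(x * y for x, y in shifted)
    # Direction of greatest spread along the axis, i.e. least spread about it.
    theta = 0.5 * math.atan2(2.0 * mu11, mu20 - mu02)
    c, s = math.cos(theta), math.sin(theta)
    aligned = [(c * x + s * y, -s * x + c * y) for x, y in shifted]
    return {(i, j): sum(x ** i * y ** j for x, y in aligned)
            for i in range(max_order + 1)
            for j in range(max_order + 1 - i)}

# Hypothetical pattern and a rotated, translated copy of it.
pattern = [(0, 0), (4, 1), (8, 1), (3, 5), (10, 4)]
phi = math.radians(40)
moved = [(math.cos(phi) * x - math.sin(phi) * y + 7,
          math.sin(phi) * x + math.cos(phi) * y - 3) for x, y in pattern]
m_a, m_b = central_moments(pattern), central_moments(moved)
print(m_a[(2, 0)], m_b[(2, 0)])
```

The direction of the principal axis is only fixed up to a half turn, so the odd moments of two equivalent patterns may differ in sign; the even moments agree.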


9.3 General Approach to Pattern Specified By Constraints

Although an algorithm for recognizing a pattern specified by constraints (the V-pattern) was easily devised, it was specific to that pattern. In order to devise more general classification schemes, many more patterns specified by constraints need to be studied.

The classification schemes would probably utilize spatial and angular relationships among the points.

9.4 Testing of Classification Schemes

Testing  of  all  the  devised  classification  schemes  should  be 
performed  on  actual  data  from  point  patterns  in  order  to  make  a 
final  determination  of  their  relative  usefulness. 
Appendix I

Formal Definition of Problem


Def  A pattern is a finite set of points in $R^2$.

Def  Let $S = \{P_1,\dots,P_r\}$ be a set, such that each element, $P_i$, of S is a finite set of points in the plane. Denote the elements of the set $P_i$ by $\{a_{i1},\dots,a_{ik(i)}\}$ where k(i) denotes the number of points in $P_i$. For $1 \leq i \leq r$, $1 \leq j \leq k(i)$, let $a_{ij} = (x_{ij}, y_{ij})$ where $x_{ij}$ and $y_{ij}$ are the cartesian coordinates of the point $a_{ij}$. The set S will be referred to as the set of ideal patterns and each element $P_i$ of S will be called an ideal pattern.

Def  Two patterns A and B are said to be equivalent if there exists a translation, rotation, and reflection of $R^2$ such that some product of those induces a 1-1 map of the point set A into the point set B.

Def  A point pattern classification procedure $T_S$ for a set $S = \{P_1,\dots,P_r\}$ of ideal patterns is a procedure realizing a function $f_{T_S}: N \to 2^{\{1,\dots,r\}}$ where N is the set of all finite point sets in the plane and $2^{\{1,\dots,r\}}$ is the set of all subsets of the set of integers {1,...,r}. Intuitively, given a pattern P in N, $f_{T_S}(P)$ consists of the indices of the ideal patterns which could possibly match P.

Def A point pattern classification procedure $T_S$,
$S = \{P_1, \ldots, P_r\}$, is called regular if for each $P$ in $N$
such that $P$ is equivalent to $P_i$ for some $i$ in $\{1, \ldots, r\}$,
we have $i \in f_{T_S}(P)$. Thus a regular point pattern classification
procedure always includes a pattern $P_i$ as a possible match for a
pattern $P$ if $P_i$ can be obtained from $P$ by translations,
rotations, and reflections.

Def A noise model $M$ for a set $S = \{P_1, \ldots, P_r\}$ of ideal
patterns is a function $M: S \to 2^N$, where $2^N$ is the set of all
subsets of the set $N$ of finite point sets in the plane. Thus $M$
assigns to any pattern $P_i$ in $S$ a set of patterns. We call any
element of this set a noisy version of $P_i$.

Def The $\alpha$-percent noise model for an ideal set of patterns
$S = \{P_1, \ldots, P_r\}$ and $\alpha \ge 0$ is the noise model
$M_\alpha$ for $S$ where, for any $1 \le i \le r$, $M_\alpha(P_i)$ is
defined as follows:

A pattern $P$ is an element of $M_\alpha(P_i)$ iff

1) The cardinalities of the two sets $P_i$ and $P$ are the
same, i.e. $|P| = |P_i|$.

2) If $P_i = \{a_1, \ldots, a_m\}$ and $P = \{b_1, \ldots, b_m\}$ then
there is a 1-1 function $g: \{1, \ldots, m\} \to \{1, \ldots, m\}$ such
that for all $1 \le k, n \le m$

$$\frac{\left| d(a_k, a_n) - d(b_{g(k)}, b_{g(n)}) \right|}{d(a_k, a_n)} \le \alpha$$

where $d(a, b)$ denotes the Euclidean distance between
points $a$ and $b$.

Intuitively, $M_\alpha$ assigns to an ideal pattern $P_i$ those patterns
in $N$ which can be obtained by perturbing the interpoint distances
in $P_i$ by no more than $\alpha$ percent.
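
As a minimal sketch, not taken from the report, membership in the noise
set of an ideal pattern can be tested by brute force over all 1-1 maps
g. The function name `is_noisy_version` is hypothetical. The sketch
assumes that the tolerance is used exactly as it appears in the
inequality (a ratio, so 0.1 stands for ten percent), and that only
pairs of distinct indices are compared, since the ratio is undefined
when both indices are the same. The search is factorial in the number
of points and is meant for small illustrative patterns only.

```python
import math
from itertools import permutations

def is_noisy_version(ideal, observed, alpha):
    """True if `observed` satisfies both conditions of the noise model for `ideal`."""
    m = len(ideal)
    if len(observed) != m:                  # condition 1: equal cardinalities
        return False
    for g in permutations(range(m)):        # condition 2: try every 1-1 map g
        ok = True
        for k in range(m):
            for n in range(k + 1, m):       # distinct pairs; distance is symmetric
                d_a = math.dist(ideal[k], ideal[n])
                d_b = math.dist(observed[g[k]], observed[g[n]])
                # Same test as |d_a - d_b| / d_a <= alpha, written without a division.
                if abs(d_a - d_b) > alpha * d_a:
                    ok = False
                    break
            if not ok:
                break
        if ok:
            return True
    return False

# Illustrative data: a right triangle and a slightly perturbed, reordered copy.
ideal = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
observed = [(0.0, 1.05), (0.0, 0.0), (1.0, 0.0)]
print(is_noisy_version(ideal, observed, 0.10))
print(is_noisy_version(ideal, observed, 0.01))
```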
Def An $\alpha$-percent regular point pattern classification procedure
$R_S$ for an ideal set of patterns $S = \{P_1, \ldots, P_r\}$ is a
regular point pattern classification procedure such that if $f_{R_S}$
is the function corresponding to $R_S$, then $P \in M_\alpha(P_i)$
implies $i \in f_{R_S}(P)$ for $1 \le i \le r$. Hence such a
classification procedure is merely a procedure which designates $P_i$
as a possible match for $P$ if their interpoint distances match to
within $\alpha$ percent.
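
One way to obtain a procedure of this kind, sketched here under stated
assumptions, is to compare sorted interpoint distance vectors, in the
spirit of the SIDV representation named in the Summary. This is an
illustration and not the report's own implementation; the names `sidv`
and `classify` and the rounding tolerance `tol` are hypothetical. The
test is a necessary condition only: if a suitable 1-1 map g exists,
then the i-th smallest distance of the observed pattern lies within the
tolerance of the i-th smallest distance of the ideal pattern, so a
matching ideal pattern is never dropped, although extra indices may be
returned.

```python
import math
from itertools import combinations

def sidv(points):
    """Sorted vector of all interpoint Euclidean distances."""
    return sorted(math.dist(p, q) for p, q in combinations(points, 2))

def classify(ideals, observed, alpha, tol=1e-9):
    """Return the set of 1-based indices of ideal patterns that cannot be ruled out."""
    v = sidv(observed)
    matches = set()
    for i, ideal in enumerate(ideals, start=1):
        if len(ideal) != len(observed):     # cardinalities must agree
            continue
        u = sidv(ideal)
        # Element-wise comparison of sorted distances; tol absorbs rounding error.
        if all(abs(a - b) <= alpha * a + tol for a, b in zip(u, v)):
            matches.add(i)
    return matches

# Illustrative data: two ideal triangles and a translated, rotated copy of the first.
ideals = [[(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)],
          [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]]
observed = [(5.0, 5.0), (5.0, 6.0), (4.0, 5.0)]
print(classify(ideals, observed, alpha=0.0))
```

Returning a set of indices rather than a single label mirrors the
definition, in which the function value is a subset of the indices
1 through r.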


Appendix 2

Nineteen  ideal  patterns  of  twenty  points  each  used  to  test 
the  SNN  and  MST  techniques 


Appendix  3 

Minimal  spanning  trees  of  the  ideal  patterns  in  appendix  2 


Appendix  4 

Minimal  spanning  trees  of  noisy  versions  of  the  patterns  in 
appendix  2 
7650 Convoy Court
P. O. Box 80817
San  Diego,  CA  92138 
PRIVATE INDUSTRY - NONPROFIT


Institute  for  Defense  Analysis 
ATTN:  Dr.  L.  Biberman  (IDA/STD) 
400  Army-Navy  Drive 
Arlington,  VA  22202 


Massachusetts Institute of Technology
Lincoln Laboratory
ATTN: Dr. T. John
P. O. Box 73
Lexington, MA 02173


The  Rand  Corporation 
ATTN:  Dr.  L.  S.  Mundie 
1700  Main  Street 
Santa Monica, CA 90406


Institute  for  Defense  Analysis 
ATTN:  Dr.  H.  Wolfhard 
400  Army-Navy  Drive 
Arlington,  VA  22202 


Institute  for  Defense  Analysis 
ATTN:  Dr.  E.  Bauer 
400 Army-Navy Drive
Arlington,  VA  22202 




